15 research outputs found

    Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering

    Full text link
    Many vision and language tasks require commonsense reasoning beyond data-driven image and natural language processing. Here we adopt Visual Question Answering (VQA) as an example task, where a system is expected to answer a question in natural language about an image. Current state-of-the-art systems attempted to solve the task using deep neural architectures and achieved promising performance. However, the resulting systems are generally opaque and they struggle in understanding questions for which extra knowledge is required. In this paper, we present an explicit reasoning layer on top of a set of penultimate neural network based systems. The reasoning layer enables reasoning and answering questions where additional knowledge is required, and at the same time provides an interpretable interface to the end users. Specifically, the reasoning layer adopts a Probabilistic Soft Logic (PSL) based engine to reason over a basket of inputs: visual relations, the semantic parse of the question, and background ontological knowledge from word2vec and ConceptNet. Experimental analysis of the answers and the key evidential predicates generated on the VQA dataset validate our approach.Comment: 9 pages, 3 figures, AAAI 201

    LoNLI: An Extensible Framework for Testing Diverse Logical Reasoning Capabilities for NLI

    Full text link
    Natural Language Inference (NLI) is considered a representative task to test natural language understanding (NLU). In this work, we propose an extensible framework to collectively yet categorically test diverse Logical reasoning capabilities required for NLI (and by extension, NLU). Motivated by behavioral testing, we create a semi-synthetic large test-bench (363 templates, 363k examples) and an associated framework that offers following utilities: 1) individually test and analyze reasoning capabilities along 17 reasoning dimensions (including pragmatic reasoning), 2) design experiments to study cross-capability information content (leave one out or bring one in); and 3) the synthetic nature enable us to control for artifacts and biases. The inherited power of automated test case instantiation from free-form natural language templates (using CheckList), and a well-defined taxonomy of capabilities enable us to extend to (cognitively) harder test cases while varying the complexity of natural language. Through our analysis of state-of-the-art NLI systems, we observe that our benchmark is indeed hard (and non-trivial even with training on additional resources). Some capabilities stand out as harder. Further fine-grained analysis and fine-tuning experiments reveal more insights about these capabilities and the models -- supporting and extending previous observations. Towards the end we also perform an user-study, to investigate whether behavioral information can be utilised to generalize much better for some models compared to others.Comment: arXiv admin note: substantial text overlap with arXiv:2107.0722

    Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks

    Full text link
    Recent explorations with commercial Large Language Models (LLMs) have shown that non-expert users can jailbreak LLMs by simply manipulating the prompts; resulting in degenerate output behavior, privacy and security breaches, offensive outputs, and violations of content regulator policies. Limited formal studies have been carried out to formalize and analyze these attacks and their mitigations. We bridge this gap by proposing a formalism and a taxonomy of known (and possible) jailbreaks. We perform a survey of existing jailbreak methods and their effectiveness on open-source and commercial LLMs (such as GPT 3.5, OPT, BLOOM, and FLAN-T5-xxl). We further propose a limited set of prompt guards and discuss their effectiveness against known attack types

    Multilingual CheckList: Generation and Evaluation

    Full text link
    The recently proposed CheckList (Riberio et al,. 2020) approach to evaluation of NLP systems has revealed high failure rates for basic capabilities for multiple state-of-the-art and commercial models. However, the CheckList creation process is manual which creates a bottleneck towards creation of multilingual CheckLists catering 100s of languages. In this work, we explore multiple approaches to generate and evaluate the quality of Multilingual CheckList. We device an algorithm -- Automated Multilingual Checklist Generation (AMCG) for automatically transferring a CheckList from a source to a target language that relies on a reasonable machine translation system. We then compare the CheckList generated by AMCG with CheckLists generated with different levels of human intervention. Through in-depth crosslingual experiments between English and Hindi, and broad multilingual experiments spanning 11 languages, we show that the automatic approach can provide accurate estimates of failure rates of a model across capabilities, as would a human-verified CheckList, and better than CheckLists generated by humans from scratch

    Melting of the vortex lattice through intermediate hexatic fluid in a-MoGe thin film

    Full text link
    The hexatic fluid refers to a phase in between a solid and a liquid which has short range positional order but quasi-long range orientational order. In the celebrated theory of Berezinskii, Kosterlitz and Thouless and subsequently refined by Halperin, Nelson and Young, it was predicted that a 2-dimensional hexagonal solid can melt in two steps: first, through a transformation from a solid to a hexatic fluid which retains quasi long range orientational order and then from a hexatic fluid to an isotropic liquid. In this paper, using a combination of real space imaging and transport measurements we show that the 2-dimensional vortex lattice in a-MoGe thin film follows this sequence of melting as the magnetic field is increased. Identifying the signatures of various transitions on the bulk transport properties of the superconductor, we construct a vortex phase diagram for a two dimensional superconductor.Comment: New Data added in this versio

    Explainable Image Understanding Using Vision and Reasoning

    No full text
    Image Understanding is fundamental to intelligent agents.Researchers have explored Caption Generation and VisualQuestion Answering as independent aspects of Image Understanding (Johnson et al. 2015; Xiong, Merity, and Socher2016). Common to most of the successful approaches, are the learning of end-to-end signal mapping (image-to-caption, image and question to answer). The accuracy is impressive. It is also important to explain a decision to end-user(justify the results, and rectify based on feedback). Very recently, there has been some focus (Hendricks et al. 2016;Liu et al. ) on explaining some aspects of the learning systems. In my research, I look towards building explainableImage Understanding systems that can be used to generate captions and answer questions. Humans learn both from examples (learning) and by reading (knowledge). Inspired by such an intuition, researchers have constructed Knowledge-Bases that encode (probabilistic) commonsense and background knowledge. In this work, we look towards efficiently using this probabilistic knowledge on top of machine learning capabilities, to rectify noise in visual detections and generate captions or answers to posed questions
    corecore